Technique for automatically tracking an object by a camera based on identification of an object

ABSTRACT

Automatic tracking by a camera of an object such as on-air talent appearing in a television show commences by first determining whether the object lying within the camera field of view matches a reference object. If so, tracking of the object then occurs to maintain the object in fixed relationship to a pre-set location in the camera's field of view, provided the designated object has moved more than a threshold distance from the pre-set location.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/637,166, filed Jun. 29, 2017, which is a continuation of U.S. application Ser. No. 12/736,227, filed Sep. 20, 2010, which is a National Stage Entry of PCT/US2009/002286, filed Apr. 13, 2009, which claims priority under 35 U.S.C. § 119 to U.S. Provisional Patent Application Ser. No. 61/124,094, filed Apr. 14, 2008, the entire contents of each of which are hereby incorporated by reference in their entirety.

TECHNICAL FIELD

This invention relates to a technique for tracking an object whose image is captured by a camera or the like.

BACKGROUND

Live production of a television program such as a news show often requires one or more television cameras to capture the image of different “on-air” talent, such as a news anchor, weather reporter and/or sports reporter. In the past, a camera operator would manually operate each television camera. Such manual operation often entailed moving the camera to different positions within a television studio to make sure that the particular on-air talent appeared in the center of the camera's field of view. During broadcasting, the on-air talent often will make slight lateral movements, forcing the camera operator to displace the camera by a corresponding amount to maintain the on-air talent within the center of the camera's field of view. The camera operator will generally observe the image of the on-air talent in the camera's viewfinder so the operator will have immediate knowledge of the movement of the talent and move the camera accordingly.

Advances in technology have led to the development of robotic television cameras, such as the “Cameraman”, available from Thomson Grass Valley, Jacksonville, Fla. Such robotic cameras operate under the control of one or more computers which manage functions such as camera displacement along the x, y, and z axes, pan, tilt, zoom and focus. By appropriately programming the computer(s), the camera will operate automatically, thus obviating the need for manual control. Typical robotic cameras have the ability to move from a known home position to one or more pre-set positions, each pre-set position enabling a particular camera shot of an on-air talent. Generally, the pre-set camera positions remain static. In other words, if the on-air talent moves even slightly to the right or left while the robotic camera remains static, then the on-air talent will appear off-center within the field of view of the camera.

To overcome this difficulty, robotic cameras can include automatic tracking technology such as the tracking system described in U.S. Pat. No. 5,668,629 issued in the name of Jeffrey Parker et al. The automatic tracking system described in the '629 patent employs an Infra-Red (IR) transmitter carried by the moving object (e.g., the on-air talent) for transmitting signals to an IR receiver carried by the robotic camera. By detecting the deviation in the signal transmitted by the transmitter as it moves with the object, the IR receiver can establish the new position of the moving object and provide that information to the computer(s) controlling the robotic camera to displace the camera accordingly.

The IR tracking technology described in the '629 patent works well for tracking a single moving object. However, tracking of multiple objects can prove problematic, such as in the case when a single robotic camera serves to capture the image of several different on-air talent, as occurs when the camera moves to capture the image of a news anchor at one instant, and a weather reporter at a different instant. Each different on-air talent would need to carry a separate IR transmitter to avoid interference, thus necessitating multiple IR receivers on the camera. This IR system also suffers from the disadvantage that the anchor person has to wear an embedded system that should be located at the center of the head to have an accurate estimate of the head position.

Thus, a need exists for a tracking technique that overcomes the aforementioned disadvantages of the prior art.

SUMMARY OF THE INVENTION

Briefly, in accordance with a preferred embodiment, there is provided a method for tracking an object within the field of view of a robotically controlled camera. The method commences by first determining whether the object lying within the camera field of view matches a reference object. If so, tracking of the object commences to maintain the object in fixed relationship to a pre-set location in the camera's field of view, provided the designated object has moved more than a threshold distance from the pre-set location. In this way, tracking occurs in accordance with the camera's field of view, and does not depend on any apparatus worn by the object being tracked.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a block schematic diagram of an exemplary system for practicing the automatic tracking technique of the present principles;

FIG. 2 depicts a Graphical User Interface (GUI) through which an operator can control the system of FIG. 1;

FIG. 3 depicts an enlarged portion of the GUI of FIG. 2 showing the manner in which an operator can manipulate a camera offset; and

FIG. 4 depicts in flow chart form the steps of a method practiced by the apparatus of FIG. 1 for performing the automatic tracking technique of the present principles.

DETAILED DESCRIPTION

FIG. 1 depicts a block schematic diagram of an exemplary system 10 for automatically tracking an object 12, such as on-air talent, in accordance with the present principles. The on-air talent 12 could take the form of a newscaster, sports reporter, or weatherman in connection with a production of a television news program, or on-air talent in connection with other types of television programming (e.g., a game show host).

The system 10 includes a robotic camera assembly 14, such as the “Cameraman” robotic camera assembly available from Thomson Grass Valley, Jacksonville, Fla. The robotic camera assembly 14 typically includes a television camera 16 that carries a zoom lens 18 whose functions, such as iris and zoom, respond to signals supplied by a processor 20, such as but not limited to, a personal computer or the like. Thus, the lens 18 has a variable zoom function. The processor 20 also controls a robotic camera pedestal 22 which has the capability of displacing the camera 16 along the x and y axes as well as panning and tilting the camera responsive to signals from the processor. The processor 20 operates to control the movement of the robotic camera pedestal 22 as well as the functions of the lens 18 in accordance with the video signal from the camera 16. Although the robotic camera system 14 depicts a single camera 16, the system could include multiple cameras controlled by a single processor or by individual processors.

FIG. 2 depicts a display of a Graphical User Interface (GUI) 200 via which an operator enters data to, and receives information from, a program executed by the processor 20 to carry out automatic tracking of an object (e.g., the on-air talent 12 of FIG. 1) in the manner described hereinafter. The GUI 200 of FIG. 2 includes a video screen 202 which displays the image of a selected television camera, such as the camera 16 of FIG. 1. The image displayed in the video screen 202 includes horizontal and vertical lines 204 a and 204 b, whose intersection 206 represents an offset associated with the tracking technique of the present principles. The offset constitutes the difference in position between the center of the object (e.g., the on-air talent 12 of FIG. 1) and the intersection 206 of the lines 204 a and 204 b. An operator can manipulate the offset by touching and dragging the lines 204 a and 204 b. The video screen 202 also displays a “safe zone box”, in the form of a border 208 which defines the region within which automatic tracking occurs. No tracking occurs for any object appearing outside the border 208. Thus, if the on-air talent 12 of FIG. 1 appears outside of the border 208, the camera 16 will not respond to movement of the on-air talent.

In addition to the video screen 202, the GUI 200 includes a plurality of “toggle buttons” 210-224, each taking the form of a particular region within the GUI, which when activated, triggers a particular action as described hereinafter. In practice, actuation of a particular one of the toggle buttons 210-224 can occur by the use of a computer mouse (not shown). Alternatively, the GUI 200 could undergo display on a touch screen so that touching the particular toggle button would trigger the corresponding action associated with that button. The toggle button 210 triggers selection of a particular one of several cameras, whereas the toggle button 212 selects a preset shot for the camera selected by the toggle button 210. Toggle button 214 triggers an edit capability to allow the operator to adjust various parameters, including but not limited to the speed of camera movement. In this way, the operator can adjust the sensitivity of the automatic tracking. Toggle button 216 triggers a new tracking session.

Toggle button 219 triggers a save of the various settings and other information associated with a current tracking session, including but not limited to related safe zone settings for particular preset camera locations. Toggle button 218 enables automatic tracking of an object (e.g., the on-air talent 12 of FIG. 1) in accordance with the method of the present principles. Toggle button 220 enables creation of a safe zone defined by the border 208 to define a region outside of which no tracking will occur. Toggle button 222, when actuated, initiates automatic tracking by entering into an “auto find” mode, whereupon the processor 20 of FIG. 1 will search the currently selected camera's field of view for a suitable object to begin tracking. Toggle button 222 automatically enables both automatic tracking and the Safe Zone without operator intervention. Lastly, toggle button 224, when actuated, triggers a help screen to assist the operator.

The GUI 200 advantageously enables an operator to set a tracking window (i.e., the border 208) as well as x and y offsets (as defined by the intersection 206 of the lines 204 a and 204 b in FIG. 2). In this way, the operator can maintain the object (the on-air talent 12 of FIG. 1) in a particular perspective, depending on graphics that appear in the same field of view as the on-air talent. For example, the graphics could appear over the right or left shoulder of the on-air talent 12 of FIG. 1, as indicated in the image depicted in the video screen 202 of FIG. 2, resulting in a “right OTS” or “left OTS” shot. Upon operator selection of the automatic tracking function following actuation of the auto track toggle button 218, the video screen 202 within the GUI 200 will display the image of the camera 16 of FIG. 1 with the current position of the offset. As described previously, the operator can make adjustments by touching the lines 204 a and 204 b and dragging them to the desired location. After saving the position of the lines 204 a and 204 b as a preset, the intersection 206 now becomes the x and y offset associated with that particular location preset. The camera 16 of FIG. 1 will track the object (e.g., the on-air talent 12 of FIG. 1) and re-adjust the position of the camera based on the difference between the stored offset and the location preset without operator intervention. FIG. 3 represents an enlarged view of the video screen 202 of FIG. 2 and more clearly depicts a tracking window having an “offset” from the center of the object in the field of view of the camera 16 of FIG. 1.

FIG. 4 depicts in flow chart form the steps of an exemplary process 400 by which the processor 20 of FIG. 1 can control the robotic camera assembly 14 of FIG. 1 to carry out automatic tracking of the on-air talent 12 of FIG. 1 in accordance with the present principles. The auto-tracking method 400 commences by first executing step 402 to create or re-set an object for tracking. Initial execution of step 402 serves to create an “empty” object. For tracking purposes, an object possesses certain characteristics, such as a shape and location, as well as certain content-based characteristics, such as color and feature points. Initially, all of the object characteristics have zero values.

Execution of step 402 also serves to reset the position of the camera 16 of FIG. 1 in the x, y and z coordinates to locate the camera at a pre-defined (e.g., a pre-set) position. Similarly, the pan, tilt, zoom and iris are set to pre-defined values.

Following step 402, execution of step 404 occurs, whereupon the processor 20 detects the object (e.g., the on-air talent 12 of FIG. 1) by comparing characteristics of the image (e.g., color, feature points, etc.) in a current video frame captured by camera 16 of FIG. 1 to a stored image of the object. Upon detecting the object (which occurs when the characteristics of the captured frame substantially match the corresponding characteristics of the stored image), the processor 20 executes step 406 to determine stability of the object. Upon failing to detect the object, step 404 undergoes re-execution upon capture of the next video frame. In practice, step 404 will undergo re-execution to detect the object for a succession of captured video frames until reaching a time-out interval to avoid the execution of an endless loop. Although not shown in FIG. 2, an operator could intervene at this point to either continue object detection, or end the process.

Tracking of the object (i.e., displacement of the camera) generally requires that the object remain stable. In other words, the object should not undergo significant motion when attempting automatic tracking. Attempting automatic tracking while the object undergoes significant motion could result in movement of the camera 16 to a location from which the object has already moved, which could lead to the camera 16 of FIG. 1 “chasing” the object. To avoid such a possibility, the operator will typically select an interval during which the object must remain generally at the same position before the processor 20 will initiate movement of the camera 16 of FIG. 1. If the object remains substantially motionless for the specified interval, then the object is deemed stable for purposes of determining stability during step 406. The object stabilization step occurs because at the initial step the camera moves in open loop (i.e., no images are processed during this time). This initial displacement can take one second or more to reach the desired preset position (the zoom command is not that fast), and when the camera finally converges to this position, the object that was still moving can be far away from this position, leading either to an object tracking failure or to a new, very large camera displacement, which is not the desired behavior.
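
The stability test of step 406 can be illustrated with a minimal sketch. The function name `is_stable`, the pixel-based drift limit `max_drift`, and the frame count `min_frames` are hypothetical names chosen for illustration; the description above only requires that the object remain generally at the same position for an operator-selected interval.

```python
import math

def is_stable(positions, min_frames, max_drift):
    """Return True if the object stayed put over the last `min_frames` frames.

    positions  -- list of (x, y) object centers, one per video frame
    min_frames -- assumed length of the operator-selected interval, in frames
    max_drift  -- assumed maximum allowed distance from the first position
                  in the interval
    """
    if len(positions) < min_frames:
        # Not enough history yet to cover the whole interval.
        return False
    window = positions[-min_frames:]
    x0, y0 = window[0]
    return all(math.hypot(x - x0, y - y0) <= max_drift for x, y in window)
```

Under these assumptions, a caller would append the detected position for each frame and only command the camera move of step 408 once `is_stable` returns True.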

If the processor 20 of FIG. 1 finds the object stable during step 406, then the processor displaces the camera 16 of FIG. 1 to the desired pre-set position, and likewise commands the lens 18 of FIG. 1 to zoom to a desired pre-set position during step 408. The operator can change these parameters using the preset modification ability available in the GUI 200 of FIG. 2. For each preset, the operator can modify the location of the center of the captured image and the image size. The operator can also change the preset using the preset selector of the GUI 200. During step 410, processor 20 updates the object characteristics and resets the object position counter used for stability determination purposes to zero. In particular, the processor 20 of FIG. 1 updates the object characteristics by establishing the position of the object in the current image. The object's characteristics include its shape (for example, a rectangle or an ellipse). Using the shape information, the processor 20 extracts content-based characteristics for tracking the object. In the event of an inability to detect object stability during step 406, process execution branches back to step 404.

Following step 410, the processor 20 of FIG. 1 executes step 412 to detect whether object tracking occurs with sufficient confidence. Object tracking occurs with sufficient confidence when the actual position of the object as detected from its characteristics lies within a given probability of its expected position, denoting the tracking confidence. An example of a tracking technique suitable for tracking objects is described infra. If the tracking confidence equals or exceeds a given threshold, the processor 20 of FIG. 1 assumes successful tracking and then proceeds to execute step 418 to test convergence. Otherwise, if the tracking confidence does not equal or exceed the threshold, then the processor 20 assumes the object to be lost.

Under such circumstances, process execution branches to step 414 to look for the object, using the position of the object in the previous frame as a reference position. The processor 20 looks for the object throughout the overall image, typically in a random manner by enlarging image sampling. A check then occurs during step 416 to determine whether the object has been found. To determine if it has found the object, the processor 20 checks whether the distance between the object characteristics and the object candidate characteristics remains lower than half of the tracking confidence. If so, then process execution branches back to step 412 to check for successful tracking. Otherwise, step 414 undergoes re-execution until the processor 20 of FIG. 1 locates the object. To avoid an endless loop, the process 400 could time out if the object is not found within a given interval. Note that the operator can change the tracking confidence in real time via the GUI 200 of FIG. 2.

Upon execution of step 418 of FIG. 4, the processor 20 of FIG. 1 determines convergence by determining if the position of the object corresponds to the desired pre-set position. At each instant in time, the object will have a convergence state, either TRUE or FALSE, depending on whether the distance between the actual position of the object and the desired pre-set position does not exceed a threshold value. Initially, the object has a FALSE convergence state. Upon detecting a FALSE convergence state, the processor 20 launches a test of convergence. If the convergence state remains FALSE when checked during step 418, then step 420 undergoes execution, whereupon the processor 20 causes the camera 16 to move to a selected preset position. The processor 20 can separately control the pan and tilt speed, with the direction determined by using different values for pan and tilt speed. An operator can change the magnitude of the camera speed via the GUI 200 of FIG. 2.

To avoid the possibility of shaking caused by the camera 16 of FIG. 1 rapidly moving back and forth over a short distance during tracking, the processor 20 performs a tolerance check during step 422 following a determination during step 418 of a TRUE convergence state. During step 422, the processor 20 checks for tolerance by making use of a tolerance radius about each preset position. If the distance between the desired pre-set position and the current object position remains less than the tolerance radius, then no further movement of the camera 16 of FIG. 1 becomes necessary and the process ends at step 424. Otherwise, if the object (e.g., the on-air talent 12 of FIG. 1) lies outside the tolerance radius, then the processor 20 resets the convergence state to FALSE and step 420 undergoes re-execution to move the camera 16 to match object position and desired preset position.
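
One possible reading of steps 418 through 424 is a two-threshold (hysteresis) rule, sketched below. This is an interpretation offered for illustration only: the function name `tracking_step`, its arguments, and the assumption that the convergence threshold is smaller than the tolerance radius are not stated in the description above.

```python
import math

def tracking_step(converged, object_pos, preset_pos,
                  convergence_threshold, tolerance_radius):
    """Return (new_convergence_state, move_camera) for one video frame.

    converged             -- convergence state carried over from the last frame
    object_pos/preset_pos -- (x, y) positions in the camera field of view
    convergence_threshold -- assumed distance at which convergence turns TRUE
    tolerance_radius      -- assumed radius inside which no movement is needed
    """
    distance = math.hypot(object_pos[0] - preset_pos[0],
                          object_pos[1] - preset_pos[1])
    if not converged:
        # Steps 418/420: keep moving the camera until the object is
        # within the convergence threshold of the preset position.
        if distance <= convergence_threshold:
            return True, False
        return False, True
    # Step 422: once converged, ignore small motion inside the tolerance radius.
    if distance < tolerance_radius:
        return True, False
    # Object left the tolerance radius: reset the state and move again.
    return False, True
```

With two distinct thresholds, small movements of the talent around the preset position do not restart camera motion, which is the shaking the tolerance check is meant to prevent.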

An example of a tracking technique includes, e.g., a method of image processing adapted for tracking any general moving object marked by a user in a sequence of images. An example of such a method is a randomized template tracking method. In the starting frame of the video sequence, a hand-marked object or a detected object is identified. Then, in the same frame, a set of N template locations is drawn using, for example, a uniform number generation. A template of predefined maximum size is associated to each of these locations. The templates $\{T_i\}_{i=1}^{N}$ are possibly trimmed in order that each of them lies within the marked object boundary. The following steps are then applied sequentially in the same order on every frame following the starting frame unless a tracking failure is signaled.

Each template $T_i$ is tracked in the current frame using, for example, a common normalized cross-correlation tracking method in a predefined search space. Such a method is further described in an article from J. P. Lewis entitled “Fast Template Matching” and published in Vision Interface in 1995 on pp. 120-123. This method associates every template with a correlation surface. The grid location corresponding to the maximum of the correlation surface is the estimated new location of the template. Consequently, a translation vector, denoted $V_i$, is derived for each template $T_i$ in the set.
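
A brute-force sketch of normalized cross-correlation matching follows. It is not the fast method of the cited article; it only shows what the correlation surface and its maximum mean. The function names `ncc` and `best_match` and the use of nested lists as grayscale images are assumptions made for illustration.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    # A flat patch or template has no defined correlation; score it as 0.
    return num / den if den > 0 else 0.0

def best_match(image, template):
    """Scan every placement and return ((row, col), score) of the maximum.

    The scores over all placements form the correlation surface; the
    returned position is the top-left corner of the best placement.
    """
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, -2.0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score
```

The translation vector of a template would then be the difference between the returned position and the template position in the preceding frame.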

A subset of these templates is then possibly discarded in the current frame, i.e. outliers are removed from the set $\{V_i\}_{i=1}^{N}$. The step 12 is referred to as the rejection control step on FIG. 1. A robust clustering step is used to derive a first subset of the set $\{V_i\}_{i=1}^{N}$. To this aim, each motion vector in the set is compared with the remaining vectors using a Euclidean distance measure. A corresponding bin-count $B_i$ is incremented for every vector that lies within a small predefined clustering radius $r$ of this vector. All bin counts are set to 1 initially for all the templates. A subset $\{V_i\}_{i \in I}$, $|I| \le N$, of the set $\{V_i\}_{i=1}^{N}$ is formed by selecting all the motion vectors with associated bin-count $B_i \ge N/2$. If N is odd, the relationship is defined as $B_i \ge (N+1)/2$. A second subset is then derived. To this aim, the two-dimensional mean of the first subset $\{V_i\}_{i \in I}$ and the resulting covariance matrix, denoted $\{\mu, \Sigma\}$, are computed. For the sake of computation, the cross variances are assumed to be zero. A Gaussian distribution $g(V)$ is defined in motion space with these parameters. A weight $w_i = g(V_i)$ is assigned to each motion vector in the set $\{V_i\}_{i=1}^{N}$. The set $\{T_i, w_i\}_{i=1}^{N}$ is then sorted in descending order of weights. The weight of the element with index $(N/2)-1$, if N is even, or $(N-1)/2$, if N is odd, is selected. This weight is denoted as C. Each template is then assigned a probability $p_i = \min\{1.0, w_i/C\}$. Templates are finally discarded based on a threshold on this probability, i.e. templates with an assigned probability lower than a predefined threshold are discarded. At the end of this step, only M templates are retained. This subset is denoted $\{T_i\}_{i=1}^{M}$, where $M \le N$.
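
The bin-count clustering that yields the first subset can be sketched as follows. Only this first stage is shown; the Gaussian weighting that produces the second subset is omitted. The function name `first_subset` and the representation of motion vectors as (x, y) tuples are assumptions for illustration.

```python
import math

def first_subset(vectors, radius):
    """Return the indices I of motion vectors kept by bin-count clustering.

    vectors -- list of (vx, vy) translation vectors, one per template
    radius  -- the small predefined clustering radius r
    """
    n = len(vectors)
    # All bin counts start at 1, as stated in the description.
    counts = [1] * n
    for i, (xi, yi) in enumerate(vectors):
        for j, (xj, yj) in enumerate(vectors):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                counts[i] += 1
    # Threshold is N/2 for even N and (N+1)/2 for odd N.
    threshold = n / 2 if n % 2 == 0 else (n + 1) / 2
    return [i for i in range(n) if counts[i] >= threshold]
```

A vector survives only if at least half of the set agrees with it to within the radius, so an isolated outlier, such as a template that locked onto the background, is dropped.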

A check then determines whether the tracking is successful. The tracking is considered successful if at least N/2 templates are retained (i.e. $M > N/2$) when N is even, or at least (N+1)/2 templates are retained (i.e. $M > (N+1)/2$) when N is odd, and is considered failing otherwise.

If the tracking is successful, the correlation surfaces of the retained templates are combined into a probability distribution p(x) from which new templates are resampled to replenish the template set.

With $\{T_i\}_{i=1}^{M}$, the target state $x_T$ is estimated to be the mean of these template locations. The target state is the location of the target object on the image grid. The minimum bounding rectangle, denoted $B_{min}$, is also constructed around these templates; it acts as the spatial bound for the state distribution computed below. Given $x_T$, the set of correlation surfaces associated with $\{T_i\}_{i=1}^{M}$ is translated to this location, summed and normalized, all within $B_{min}$, to result in a probability distribution p(x). We consider p(x) as the distribution of the target state $x_T$ generated by the randomized template tracking method. N−M template locations, denoted $y_k$, are sampled from p(x) as follows:

$y_k \sim p(x), \quad 1 \le k \le (N - M)$

To each sampled location in the set $\{y_k\}$, Gaussian noise is added to increase sample diversity as shown below:

$\hat{y}_k = y_k + \eta(0, \sigma^2)$

Finally, an image patch around each sample location $y_k$ is associated with it as described below. In the absence of a natural scale estimation scheme here, we are forced to remain conservative in choosing the dimensions of the new template. The chosen template is trimmed to fit $\alpha B_{min}$, $\alpha \ge 1$. The value $\alpha = 1$ is the most conservative estimate, which we employ when tracking very small sized objects in aerial videos and the like. Here $\alpha$ must not be misconstrued to be a scale estimate of the tracked object.
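
The replenishment step, drawing N−M locations from p(x) and adding Gaussian noise, can be sketched as below. The dictionary representation of p(x), the function name `resample_locations`, and the optional `rng` argument are assumptions made for illustration.

```python
import random

def resample_locations(prob_map, count, sigma, rng=None):
    """Draw `count` locations from p(x) and jitter each with Gaussian noise.

    prob_map -- dict mapping (x, y) grid locations to their probability p(x)
    count    -- number of new templates to create, i.e. N - M
    sigma    -- standard deviation of the zero-mean noise added per coordinate
    rng      -- optional random.Random instance, for reproducibility
    """
    rng = rng or random.Random()
    locations = list(prob_map.keys())
    weights = list(prob_map.values())
    # Sample with replacement, proportionally to p(x).
    drawn = rng.choices(locations, weights=weights, k=count)
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in drawn]
```

Each returned location would then receive a new template patch trimmed to the bounding rectangle, as the preceding paragraph describes.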

If the tracking failed, all templates in the set $\{T_i\}_{i=1}^{N}$ are retained and their positions are extrapolated by the last successfully registered object translation, meaning the translation of the target state at the step in the past when tracking was signaled as successful. A target state $x_T$ is however estimated, at step 23, mainly for display purposes. In this case the target state $x_T$ is estimated to be the mean of the template locations after extrapolation. Control then passes to the next iteration when the next frame arrives. Such a scheme is found to be useful in handling very short occlusions of a few frames.

Probabilistic tracking methods are also known from the prior art. Some of them are based on particle filtering. An example of such a color model based particle filtering method is given in the document from P. Perez et al. entitled “Data fusion for visual tracking with particles” and published in Proc. IEEE, 92(3):495-513, 2004.

For example, initially, the particle positions, denoted $\{x_i\}_{i=1}^{K}$, in the current frame are predicted from the positions estimated in the past frame using, for example, a random walk approach. A weight denoted $\pi_i$ is also computed for each particle position by matching the local histogram around the particle to the constant color histogram model of the target object stored in a memory from the first frame. This weight is higher if the location matches the color model. The weights $\{\pi_i\}$ sum to unity. The set $\{\pi_i, x_i\}_{i=1}^{K}$ defines a probability distribution from which the target state $x_c$, i.e. the location of the target object on the image grid, is estimated.

A rejection probability is computed. This rejection probability is further used to make a binary decision for the color based tracking method. To this aim, the covariance matrix $C_\pi$ of the distribution $\{x_i, \pi_i\}_{i=1}^{K}$ and the covariance matrix $C_s$ of the distribution $\{x_i, 1/K\}_{i=1}^{K}$ are computed. Then the determinants of these covariance matrices are computed. From the property of determinants we know that this scalar quantity measures the effective volume of the parallelepiped constructed by the row vectors of the matrix. This motivates the use of this quantity as a scalar confidence measure to determine if the tracking is successful or not. With the above notations, the rejection probability $p_r$ is defined as follows:

$p_r = \min\left\{1.0,\ \frac{\det[C_\pi]}{\det[C_s]}\right\}$

$p_r$ tends to 1 as the uncertainty in the distribution increases and tends towards 0 as the distribution becomes more peaked. It is interesting to note that it can be inconsistent to analyze the performance of the filter based solely on the evolution of its covariance over time. This is because the spread (covariance) of the samples at each tracking step is not constant, and even with resampling there are bound to be some fluctuations. Therefore, it is necessary to account for this variable spread via a factor like $C_s$.

The tracking is considered successful if $p_r$ is less than an empirical threshold and is considered failing otherwise.
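
A minimal sketch of the rejection probability for two-dimensional particles follows. It assumes that the numerator covariance uses the particle weights and that the normalizing covariance uses uniform weights 1/K over the same particle positions; the function names and the handling of a degenerate (zero-volume) particle set are illustrative choices, not part of the description.

```python
def covariance_det(points, weights):
    """Determinant of the weighted 2x2 covariance of (x, y) points.

    The weights are assumed to sum to one.
    """
    mx = sum(w * x for (x, _), w in zip(points, weights))
    my = sum(w * y for (_, y), w in zip(points, weights))
    sxx = sum(w * (x - mx) ** 2 for (x, _), w in zip(points, weights))
    syy = sum(w * (y - my) ** 2 for (_, y), w in zip(points, weights))
    sxy = sum(w * (x - mx) * (y - my) for (x, y), w in zip(points, weights))
    return sxx * syy - sxy ** 2

def rejection_probability(points, weights):
    """Return min(1, det of weighted covariance / det of uniform covariance)."""
    k = len(points)
    det_pi = covariance_det(points, weights)
    det_s = covariance_det(points, [1.0 / k] * k)
    if det_s <= 0:
        # Degenerate spread: treat as maximally uncertain (an assumption).
        return 1.0
    return min(1.0, det_pi / det_s)
```

Uniform weights give a value of 1, while weights concentrated on one particle shrink the weighted covariance and push the value towards 0, matching the behavior described for the rejection probability.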

If the tracking is successful, the filtering distribution is resampled and the target state $x_c$ is estimated to be the mean of the estimated distribution.

If the tracking failed, the resampling of the filtering distribution is stopped. This causes the samples to spread more and more at each tracking step. In the absence of clutter, the sample weights tend to be uniform, which results in the probability $p_r$ tending to 1. But once a subset of the samples gains distinctly more weight a few frames later (say, in a relock scenario after a few frames of occlusion), the rejection probability $p_r$ tends towards 0, leading to a success signal. In this case, the target state $x_T$ is estimated to be the mean of the filtering distribution.

The particle filtering recommences when the next frame arrives.

Image processing adapted for tracking an object in a sequence of images may comprise the following steps applied on each image of the sequence:

- determining the location of N templates in the current image on the basis of locations of the N templates in a preceding image, with N an integer, the step being called first tracking step;
- determining, according to a first predefined criterion, if the first tracking step is successful or is not successful; and
- determining a probability distribution, called first probability distribution, representing the probability distribution of the object location in the current image.

The method further comprises the following steps:

- determining the location of K particles in the current image on the basis of locations of the K particles in the preceding image, with K an integer, and assigning each particle a weight representing the level of match between an observation made for the particle in the current image and a predefined model, the K particles' locations and associated weights defining a second probability distribution, the step being called second tracking step;
- determining, according to a second predefined criterion, if the second tracking step is successful or is not successful;
- if the first tracking step is not successful and the second tracking step is successful:
  - determining the location of N new templates in the current image based on the second probability distribution;
  - replacing the N templates in the current image with the N new templates;
  - determining the location of the object in the current image based on the second probability distribution; and
- if the first tracking step is successful and the second tracking step is not successful:
  - determining the location of K new particles in the current image based on the first probability distribution;
  - replacing the K particles in the current image with the K new particles and assigning each particle a same predefined weight;
  - determining the location of the object in the current image based on the first probability distribution; and
- if both first and second tracking steps are successful or if both first and second tracking steps are not successful, determining the location of the object in the current image on the basis of the first probability distribution or on the basis of the second probability distribution.
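
The case analysis in the steps above reduces to a small decision function, sketched here. The function name `fuse` and the string labels it returns are hypothetical; they only name which probability distribution locates the object and which tracker, if any, is re-initialized from the other.

```python
def fuse(first_ok, second_ok):
    """Return (source, reinitialize) for one image of the sequence.

    first_ok  -- True if the first (template) tracking step succeeded
    second_ok -- True if the second (particle) tracking step succeeded
    source       -- "first", "second" or "either": the distribution used
                    to determine the object location
    reinitialize -- "templates", "particles" or None: the set that is
                    replaced using the other tracker's distribution
    """
    if not first_ok and second_ok:
        # Templates failed: redraw N templates from the second distribution.
        return "second", "templates"
    if first_ok and not second_ok:
        # Particles failed: redraw K particles from the first distribution,
        # each with the same predefined weight.
        return "first", "particles"
    # Both succeeded or both failed: either distribution may be used.
    return "either", None
```

The point of the arrangement is that whichever tracker fails is restarted from the one that still holds the object, so a single failure does not lose the target.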

The predefined model may be a color histogram of the object, the observation may be a color histogram computed in a patch centered on the particle location, and the predefined weight may equal 1 divided by K.

The first tracking step may comprise:

- determining, for each template of the preceding image, an image area in the current image, called correlation surface, whose correlation with a previously defined correlation surface centered on the template is maximum; and
- determining the template positions in the current image as the centers of the correlation surfaces in the current image.

The method may further comprise a step for assigning a priority level to the first tracking step and to the second tracking step, and:

-   if the priority level of the first tracking step is higher than the priority level of the second tracking step, then:
    -   if the first tracking step is not successful and the second tracking step is successful, switching the priority levels between the first tracking step and the second tracking step;
    -   if the first tracking step is successful and the second tracking step is not successful, leaving the priority levels unchanged;
    -   if both first and second tracking steps are successful or if both first and second tracking steps are not successful, determining the location of the object in the current image on the basis of the first probability distribution and leaving the priority levels unchanged; and
-   if the priority level of the first tracking step is lower than the priority level of the second tracking step, then:
    -   if the first tracking step is not successful and the second tracking step is successful, leaving the priority levels unchanged;
    -   if the first tracking step is successful and the second tracking step is not successful, switching the priority levels between the first tracking step and the second tracking step;
    -   if both first and second tracking steps are successful or if both first and second tracking steps are not successful, determining the location of the object in the current image on the basis of the second probability distribution.
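The priority handling above can be summarized in a short sketch. The function name `fuse_step`, the labels "T" and "C" for the first and second tracking steps, and the returned tuple layout are hypothetical choices made for illustration.

```python
def fuse_step(priority, t_ok, c_ok):
    """Decide one tracking step.

    priority: "T" (first tracking step) or "C" (second tracking step).
    t_ok, c_ok: success flags of the first and second tracking steps.
    Returns (new_priority, estimate_source, cross_sample), where
    cross_sample is None or "A->B", meaning that A re-seeds B.
    """
    other = "C" if priority == "T" else "T"
    ok = {"T": t_ok, "C": c_ok}
    if ok[priority] and ok[other]:
        # Both succeeded: estimate from the priority side, nothing changes.
        return priority, priority, None
    if ok[priority]:
        # Only the priority side succeeded: it re-seeds the failed side.
        return priority, priority, f"{priority}->{other}"
    if ok[other]:
        # Only the other side succeeded: priority is switched to it.
        return other, other, f"{other}->{priority}"
    # Both failed: resampling is interrupted, estimate stays on the priority side.
    return priority, priority, None
```

For instance, `fuse_step("T", False, True)` returns `("C", "C", "C->T")`: the priority moves to the second tracking step, which also supplies the estimate and re-seeds the templates.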

A tracking method may combine two known tracking methods. As an example, the two tracking methods described may be used. However, any other tracking methods may be used provided that one of them relies on an adaptive set/catalogue model of the object and the other relies on a constant holistic model like a color histogram of the object. In the following, the two methods are referred to as method T and method C, where T stands for template tracking and C for color based tracking. Method C may be replaced by any particle filtering based tracking method, i.e. not necessarily color based.

At the initialization stage of tracking, a priority denoted P is assigned to one of the two tracking methods. This priority P is then switched between the two methods depending on whether each method succeeds or fails at each tracking step. Furthermore, at each instant the method that has the priority makes an estimation of the target state, denoted X.

Both methods are run independently without any interaction or information exchange except possibly at the very end. Therefore, both methods may run in parallel.

As a first example, it is assumed that at the initialization stage method T has been given the priority, i.e. P=T.

-   if T is successful, the target state X is made from T;
-   otherwise (i.e. if T failed) it is checked if C is successful:
    -   if C is successful, the cross sampling described below applies, P is set to C and the target state X is made from C;
    -   otherwise (i.e. if C failed), resampling is interrupted and the target state is made from T.

It should be understood that even if P=T:

-   if C is successful, the target state is estimated from the T side (if T is also successful) or from the C side (if T failed);
-   otherwise (i.e. if C failed):
    -   if T also failed, resampling is interrupted at step 34 but the target state is made from T since P=T;
    -   otherwise, cross-sampling applies, and the target state is estimated from T.

As a second example, it is assumed that at the initialization stage method C has been given the priority, i.e. P=C.

-   if C is successful, the target state X is made from C;
-   otherwise (i.e. if C failed) it is checked if T is successful:
    -   if T is successful, the cross sampling step 38 described below applies, P is set to T and the target state X is made from T;
    -   otherwise (i.e. if T failed), resampling is interrupted and the target state is made from C.

It should be understood that even if P=C:

-   if T is successful, the target state is estimated either from the C side (if C is also successful) or from the T side (if C failed);
-   otherwise (i.e. if T failed):
    -   if C also failed, resampling is interrupted, but the target state is made from C since P=C;
    -   otherwise, cross-sampling applies, and the target state is estimated from C.

At cross-sampling steps, both tracking methods interact and exchange information in only one direction, i.e. the successful tracking method provides information, in the form of a probability distribution of locations of the target object, to the other tracking method. There are two types of cross sampling. Both steps are further detailed below.

If P=T and method T is successful, then the state estimate, as stated earlier, is made from method T. Suppose that at some later instant method T fails but method C is successful, possibly when there is partial occlusion and/or jerky rotation out of the image plane. The target state is now estimated from method C, the priority is switched to method C and the following cross sampling applies. The entire template set is discarded, i.e. erased from memory, and a new set of N template locations is sampled from the color filtering distribution (i.e. $\{\pi_i, x_i\}_{i = 1}^{i = K}$) at the corresponding instant, and each location is assigned an image template, e.g. a rectangle of a predefined size. The templates are then trimmed, if necessary, to fit the bounds of the object as decided by method C. Therefore, the trimming step may change the size of each template. It is also to be noted that the past object model composed of the templates is totally discarded and updated at the current instant from the color distribution. This step gives the template tracking method, on the whole, a new set/catalogue model of the target object.
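A minimal sketch of this first type of cross sampling is given below, assuming that particle locations are (x, y) tuples, that templates are axis-aligned squares and that the object bounds decided by method C are available as a rectangle. The function names and the rectangle convention (x0, y0, x1, y1) are hypothetical.

```python
import random

def sample_template_locations(particles, weights, n, rng=None):
    """Draw N template centers from the weighted particle set {pi_i, x_i}."""
    rng = rng or random.Random()
    # Locations with larger weights are drawn proportionally more often.
    return rng.choices(particles, weights=weights, k=n)

def trim_template(center, half_size, bounds):
    """Clip a square template to the object bounds (x0, y0, x1, y1).

    The returned rectangle may be smaller than the predefined size,
    which is why trimming can change the size of a template.
    """
    cx, cy = center
    x0, y0, x1, y1 = bounds
    return (max(x0, cx - half_size), max(y0, cy - half_size),
            min(x1, cx + half_size), min(y1, cy + half_size))
```

Sampling with replacement is used here for simplicity, so several templates may share a center; whether duplicates should be avoided is not stated in the description.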

If P=C and method C is successful, then the state estimate, as stated earlier, is made from method C. Now say at some later instant method C fails but method T is successful, typically when there are extreme illumination changes and/or nearby clutter. The state estimate is now made from method T, the priority is switched to method T and the following cross sampling applies. The current samples $\{\pi_i, x_i\}_{i = 1}^{i = K}$, which compose the filtering distribution, are discarded, i.e. erased from memory, and replaced by new samples drawn from the target state distribution p(x) defined by the combination of correlation surfaces output by method T. Each sample is weighted equally, i.e. the new weights are all set to 1/K, to result in the distribution given as

$\left\{ \frac{1}{K}, x_i \right\}_{i = 1}^{i = K}.$

The color model, i.e. the color histogram, is however not updated. The constant color model of the target object derived from the first frame is stored in a memory and is never updated.
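This second type of cross sampling might be sketched as follows. The sketch assumes that p(x) is available as a discrete mapping from candidate locations to non-negative scores obtained from the correlation surfaces; that representation and the function name `reseed_particles` are assumptions, not part of the description.

```python
import random

def reseed_particles(p_x, k, rng=None):
    """Replace the particle set with K samples drawn from p(x).

    p_x: dict mapping a location (x, y) to a non-negative score.
    Returns a list of (weight, location) pairs, every weight being 1/K.
    """
    rng = rng or random.Random()
    locations = list(p_x)
    scores = [p_x[loc] for loc in locations]
    # random.choices normalizes the scores internally.
    samples = rng.choices(locations, weights=scores, k=k)
    return [(1.0 / k, x) for x in samples]
```

Only the particle set is replaced in this sketch; the reference color histogram is left untouched, in line with the constant color model described above.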

If both methods T and C fail at the same tracking stage (i.e. on the same image), then the resampling steps in both tracking methods are interrupted. This case may occur in a complete occlusion scenario where the occluding object shares no commonality with the target reference model.

According to another embodiment, no priority is set, but T is the main tracking method and C is used to update its state distribution, especially when T has failed and C is successful. This solution is equivalent to setting P=T and never updating P.

In a third embodiment, no priority is set, but C is the main tracking method and T is used to update its state distribution, especially when C has failed and T is successful. This solution is equivalent to setting P=C and never updating P.

The target model which is employed in the method according to the invention is a multi-part model. Advantageously, the method of the invention does not require a single fused target state distribution but uses instead two target state distributions: one coming from the template tracking method and the other one coming from the color based particle filtering method. The first part is composed of the gray level templates. Due to the sequential resampling procedure, at any tracking instant, the age of each element in the template set is possibly different from the rest. Therefore, this set consists of templates having lifespans of a few to several tens of frames and thus plays the role of the dynamically adapted part of the entire target model. The second part of the model is the constant color histogram of the target object. This is the static part of the two-part target appearance model and does not interact with the first part. The histogram is deliberately kept constant to avoid false adaptation due to illumination, orientation and size changes.

The foregoing describes a technique for automatically tracking an object.

What is claimed:
 1. A system for automatically tracking objects in a video feed generated, the system comprising: a camera configured to generate the video feed for the production of a television program; an object detector configured to compare an image characteristic the video feed with a stored image of an object to detect whether the object is captured in the video feed; a tolerance determiner configured to calculate whether a distance between a pre-set position in a field of view of the camera and the detected object therein is within a tolerance radius; and an object tracker configured to automatically control a robotic camera pedestal to physically adjust a positioning of the camera and a corresponding position of the object within the field of view of the camera when the distance between the pre-set position and the detected object is outside the tolerance radius.
 2. The system according to claim 1, further comprising a tracking window setting module configured to define a tracking window in the field of view of the camera based on an operator defined region of interest received by a user interface, such that the system will not track the object in the video feed when the object is outside the defined tracking window.
 3. The system according to claim 2, further comprising a tracking location setting module configured to define the pre-set position in the field of view of the camera for the identified object; and wherein the object tracker is configured to automatically track the identified object by controlling the robotic camera pedestal when the camera generates the video feed to maintain the identified and tracked object in a fixed relationship relative to the defined pre-set position in the field of view of the camera.
 4. The system according to claim 1, wherein the object tracker is configured to automatically control the robotic camera pedestal by displacing the camera in an X and Y axis to physically adjust the positioning of the camera.
 5. The system according to claim 1, wherein the object tracker is configured to automatically control at least one of a panning operation and a tilting operation of the camera to physically adjust the positioning of the camera.
 6. The system according to claim 1, wherein the object detector includes a processor configured to execute instructions stored in memory to compare the image characteristic with the stored image, the tolerance determiner includes a processor configured to execute instructions stored in memory to calculate whether the distance between the pre-set position in the field of view and the detected object therein is within the tolerance radius, and the object tracker includes a processor configured to execute instructions stored in memory to automatically control the robotic camera pedestal to physically adjust the positioning of the camera.
 7. An apparatus for automatically tracking objects in a video feed generated by a camera, the apparatus comprising: a memory; and a processor configured to implement instructions stored on the memory so as to provide: an object detector configured to compare an image characteristic in frames of the video feed with a stored image of an object to detect the object in the video feed; a tolerance determiner configured to calculate whether a distance between a pre-set position in a field of view of the camera and the detected object therein is within a tolerance radius; and an object tracker configured to automatically control a robotic camera pedestal to physically adjust a positioning of the camera and a corresponding position of the object within the field of view of the camera when the distance between the pre-set position and the detected object is outside the tolerance radius in at least one frame of the video feed.
 8. The apparatus according to claim 7, further comprising the camera configured to capture the frames of the video feed for the production of a television program.
 9. The apparatus according to claim 7, wherein the processor is configured to implement instructions stored on the memory so as to provide a tracking window setting module configured to define a tracking window in the field of view of the camera, such that the system will not track the object in the video feed when the object is outside the defined tracking window.
 10. The apparatus according to claim 9, wherein the tracking window setting module is configured to define the tracking window based on an operator defined region of interest received by a user interface.
 11. The apparatus according to claim 9, wherein the processor is configured to implement instructions stored on the memory so as to provide a tracking location setting module configured to define the pre-set position in the field of view of the camera for the identified object.
 12. The apparatus according to claim 11, wherein the object tracker is configured to automatically track the identified object by controlling the robotic camera pedestal when the camera captures the frames of the video feed to maintain the identified and tracked object in a fixed relationship relative to the defined pre-set position in the field of view of the camera.
 13. The apparatus according to claim 7, wherein the object tracker is configured to automatically control the robotic camera pedestal by displacing the camera in an X and Y axis to physically adjust the positioning of the camera.
 14. The apparatus according to claim 7, wherein the object tracker is configured to automatically control at least one of a panning operation and a tilting operation of the camera to physically adjust the positioning of the camera.
 15. A system for automatically tracking objects in a video feed generated by a camera, the system comprising: an object detector configured to compare an image characteristic in frames of the video feed with a stored image of an object to detect the object in the video feed; a tolerance determiner configured to calculate whether a distance between a pre-set position in a field of view of the camera and the detected object therein is within a tolerance radius; and an object tracker configured to automatically control a robotic camera pedestal to physically adjust a positioning of the camera and a corresponding position of the object within the field of view of the camera when the distance between the pre-set position and the detected object is outside the tolerance radius in at least one frame of the video feed.
 16. The system according to claim 15, further comprising the camera configured to capture the frames of the video feed for the production of a television program.
 17. The system according to claim 15, further comprising a tracking window setting module configured to define a tracking window in the field of view of the camera, such that the system will not track the object in the video feed when the object is outside the defined tracking window.
 18. The system according to claim 16, wherein the tracking window setting module is configured to define the tracking window based on an operator defined region of interest received by a user interface.
 19. The system according to claim 17, further comprising a tracking location setting module configured to define the pre-set position in the field of view of the camera for the identified object; and wherein the object tracker is configured to automatically track the identified object by controlling the robotic camera pedestal when the camera captures the frames of the video feed to maintain the identified and tracked object in a fixed relationship relative to the defined pre-set position in the field of view of the camera.
 20. The system according to claim 15, wherein the object tracker is configured to automatically control the robotic camera pedestal by displacing the camera in an X and Y axis to physically adjust the positioning of the camera.
 21. The system according to claim 15, wherein the object tracker is configured to automatically control at least one of a panning operation and a tilting operation of the camera to physically adjust the positioning of the camera.
 22. The system according to claim 15, wherein the object detector includes a processor configured to execute instructions stored in memory to compare the image characteristic with the stored image, the tolerance determiner includes a processor configured to execute instructions stored in memory to calculate whether the distance between the pre-set position in the field of view and the detected object therein is within the tolerance radius, and the object tracker includes a processor configured to execute instructions stored in memory to automatically control the robotic camera pedestal to physically adjust the positioning of the camera.